On Provable Benefits of Muon in Federated Learning
The recently introduced Muon optimizer has attracted increasing attention owing to its strong performance across a wide range of applications. However, its effectiveness in federated learning remains unexplored. To address this gap, this paper investigates the performance of Muon in the federated setting. Specifically, we propose a new algorithm, FedMuon, and establish its convergence rate for nonconvex problems. Our theoretical analysis reveals multiple favorable properties of FedMuon. In particular, due to its orthonormalized update direction, the learning rate of FedMuon is independent of problem-specific parameters, and, importantly, it can naturally accommodate heavy-tailed noise. Extensive experiments on a variety of neural network architectures validate the effectiveness of the proposed algorithm.
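The orthonormalized update direction the abstract refers to is, in Muon, typically obtained by a Newton-Schulz iteration that replaces the momentum-accumulated gradient of a weight matrix with the nearest semi-orthogonal matrix. The sketch below illustrates that idea inside a federated loop; the `local_muon_steps` and `fedavg_round` helpers, the step counts, and the plain averaging rule are illustrative assumptions, not FedMuon's actual specification.

```python
import numpy as np

def orthogonalize(G, steps=10):
    # Cubic Newton-Schulz iteration. With X0 = G / ||G||_F, all singular
    # values start in (0, 1], and the map s -> (3s - s^3) / 2 drives them
    # to 1, so X converges to U V^T where G = U S V^T (semi-orthogonal).
    X = G / (np.linalg.norm(G) + 1e-12)
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X

def local_muon_steps(W, batches, grad_fn, lr=0.02, momentum=0.95):
    # Hypothetical client-side loop: momentum buffer, orthonormalized step.
    M = np.zeros_like(W)
    for batch in batches:
        M = momentum * M + grad_fn(W, batch)
        W = W - lr * orthogonalize(M)
    return W

def fedavg_round(W_global, client_batches, grad_fn):
    # One communication round: each client updates locally from the global
    # weights, then the server averages the resulting models.
    updated = [local_muon_steps(W_global.copy(), b, grad_fn)
               for b in client_batches]
    return np.mean(updated, axis=0)
```

Because the direction applied to the weights always has unit singular values, the per-step movement is governed by `lr` alone, which is consistent with the abstract's claim that the learning rate need not be tuned to problem-specific constants.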
Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning
We study risk-sensitive reinforcement learning (RL) based on the entropic risk measure. Although existing works have established non-asymptotic regret guarantees for this problem, they leave open an exponential gap between the upper and lower bounds. We identify the deficiencies in existing algorithms and their analysis that result in such a gap.
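For context, the entropic risk measure referenced here is the standard exponential-moment functional: for a return $X$ and risk parameter $\beta \neq 0$,
\[
\rho_\beta(X) = \frac{1}{\beta} \log \mathbb{E}\left[ e^{\beta X} \right],
\]
which Taylor-expands to $\mathbb{E}[X] + \frac{\beta}{2}\,\mathrm{Var}(X) + O(\beta^2)$ for small $|\beta|$ and recovers the risk-neutral objective as $\beta \to 0$; the exponential moment inside the logarithm is where exponentially large factors can enter regret analyses.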
High dimensional online calibration in polynomial time
In online (sequential) calibration, a forecaster predicts probability distributions over a finite outcome space $[d]$ over a sequence of $T$ days, with the goal of being calibrated. While asymptotically calibrated strategies are known to exist, they suffer from the curse of dimensionality: the best known algorithms require $\exp(d)$ days to achieve non-trivial calibration. In this work, we present the first asymptotically calibrated strategy that guarantees non-trivial calibration after a polynomial number of rounds. Specifically, for any desired accuracy $\epsilon > 0$, our forecaster becomes $\epsilon$-calibrated after $T = d^{O(1/\epsilon^2)}$ days. We complement this result with a lower bound, proving that at least $T = d^{\Omega(\log(1/\epsilon))}$ rounds are necessary to achieve $\epsilon$-calibration. Our results resolve the open questions posed by [Abernethy-Mannor'11, Hazan-Kakade'12]. Our algorithm is inspired by recent breakthroughs in swap regret minimization [Peng-Rubinstein'24, Dagan et al.'24]. Despite its strong theoretical guarantees, the approach is remarkably simple and intuitive: it randomly selects among a set of sub-forecasters, each of which predicts the empirical outcome frequency over recent time windows.
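To make the final sentence concrete, here is a minimal sketch of a forecaster of that shape: each day it samples one of several window lengths and reports the empirical outcome frequency over that window. The window set, the uniform selection rule, and the uniform prior are assumptions for illustration; the paper's actual construction and its calibration guarantee are more delicate.

```python
import numpy as np

def window_forecaster(history, w, d):
    # Sub-forecaster: empirical frequency of outcomes over the last w days.
    recent = history[-w:]
    if not recent:
        return np.full(d, 1.0 / d)  # uniform prediction before any data
    counts = np.bincount(recent, minlength=d).astype(float)
    return counts / counts.sum()

def forecast(history, windows, d, rng):
    # Randomly select among the window-based sub-forecasters each day.
    w = int(rng.choice(windows))
    return window_forecaster(history, w, d)

# Usage with d = 3 outcomes and geometrically spaced windows (an assumption).
rng = np.random.default_rng(0)
history = []
for t in range(100):
    p = forecast(history, windows=[1, 2, 4, 8, 16], d=3, rng=rng)
    outcome = int(rng.integers(3))  # stand-in for the adversary's outcome
    history.append(outcome)
```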
Building a stable classifier with the inflated argmax
Soloff, Jake A., Barber, Rina Foygel, Willett, Rebecca
We propose a new framework for algorithmic stability in the context of multiclass classification. In practice, classification algorithms often operate by first assigning a continuous score (for instance, an estimated probability) to each possible label, then taking the maximizer -- i.e., selecting the class with the highest score. A drawback of this approach is that it is inherently unstable: because taking the maximizer is discontinuous, the output is highly sensitive to slight perturbations of the training data. Motivated by this challenge, we propose a pipeline for constructing stable classifiers from data, using bagging (i.e., resampling and averaging) to produce stable continuous scores, and then applying a stable relaxation of argmax, which we call the "inflated argmax," to convert these scores to a set of candidate labels. The resulting stability guarantee places no distributional assumptions on the data, does not depend on the number of classes or the dimensionality of the covariates, and holds for any base classifier. Using a common benchmark data set, we demonstrate that the inflated argmax provides necessary protection against unstable classifiers, without loss of accuracy.
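A minimal sketch of the pipeline's two stages follows. Bagging is implemented directly as described; for the set-valued step, the margin rule below (return every label within `eps` of the top score) is a simplified stand-in for the paper's inflated argmax, whose exact definition differs. `fit_score_fn` is a hypothetical callable that fits a base classifier on a resample and scores a test point.

```python
import numpy as np

def bagged_scores(fit_score_fn, X_train, y_train, x, n_bags=50, seed=0):
    # Bagging: average the per-class scores over bootstrap resamples of
    # the training data, smoothing out the base classifier's instability.
    rng = np.random.default_rng(seed)
    n = len(y_train)
    scores = []
    for _ in range(n_bags):
        idx = rng.integers(n, size=n)  # bootstrap resample (with replacement)
        scores.append(fit_score_fn(X_train[idx], y_train[idx], x))
    return np.mean(scores, axis=0)

def margin_argmax(scores, eps):
    # Simplified stand-in for the inflated argmax: return every label whose
    # score is within eps of the maximum, so near-ties produce a set of
    # candidate labels instead of an arbitrary single winner.
    scores = np.asarray(scores, dtype=float)
    return np.flatnonzero(scores >= scores.max() - eps)
```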